The design and implementation of FlexiMatch: a learning, flexible and extendible framework for matching schemas
Abstract
Automated schema matching has been under investigation in many areas for several decades, yet schemas are still often matched manually by domain experts. Because the number of heterogeneous and distributed data sources in enterprises and on the web is growing rapidly, manual matching is increasingly a limitation and automating the schema matching process is increasingly important. At the start of this report we investigate the different approaches to schema matching. From evaluations of implementations based on these approaches, schema matching 'trends' and problems are discerned, which are taken as the starting point of this thesis. This report describes the schema matching framework FlexiMatch, which supports the multi-strategy approach, with each strategy represented as a Validator. Key characteristics of FlexiMatch are that:
• FlexiMatch and its Validator components can learn from previous mappings.
• Validators can easily be added to, or selected from, the Validator repository, in order to boost future matching performance or to adapt the system to the match task at hand.
The main perspective of FlexiMatch is that the elements of schemas (relational column and table elements, or XML elements and attributes) from a certain domain share domain concepts. FlexiMatch learns these concepts from previous mappings. Schema elements belonging to a certain domain concept can have various representations. Within FlexiMatch, concepts are therefore represented by interconnected subconcepts. Such an interconnected subconcept group is derived from the different schema element representations of a certain domain concept that FlexiMatch encountered during previous schema mappings. These subconcepts and their interrelations are used as an intermediate schema to derive matches between input schema elements: schema elements are first matched with subconcepts, and schema elements that are matched with similar, interrelated subconcepts are then combined with each other. The main goal of the FlexiMatch system for this thesis is to build a learning framework in which the gained knowledge can be used in future match tasks. Although many aspects of FlexiMatch could still be improved or implemented to enhance performance, the evaluation of FlexiMatch shows that the main goal is achieved: subconcepts and subconcept relations are learned from previous mappings, and they help in finding new ones in later match tasks.
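The idea of matching through learned subconcepts can be illustrated with a minimal Python sketch. It is not the FlexiMatch implementation: the names `name_validator`, `SubconceptStore`, `learn`, `concept_of` and `match` are hypothetical, the single string-similarity Validator and the 0.8 threshold are assumptions, and grouping interrelated subconcepts with a union-find structure is a simplification chosen here for brevity.

```python
from difflib import SequenceMatcher
from itertools import product


def name_validator(a: str, b: str) -> float:
    """Hypothetical Validator: case-insensitive name similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


class SubconceptStore:
    """Assumed stand-in for learned concepts: each element name seen in a
    previous mapping is a subconcept, and mapped names are linked into one
    group (union-find), which plays the role of a domain concept."""

    def __init__(self):
        self.parent = {}

    def _find(self, name):
        # Return the representative subconcept of the group containing name.
        self.parent.setdefault(name, name)
        while self.parent[name] != name:
            self.parent[name] = self.parent[self.parent[name]]
            name = self.parent[name]
        return name

    def learn(self, mapping):
        """Link the two sides of every (source, target) pair of a mapping."""
        for a, b in mapping:
            root_a, root_b = self._find(a.lower()), self._find(b.lower())
            if root_a != root_b:
                self.parent[root_a] = root_b

    def concept_of(self, element, validators, threshold=0.8):
        """Match an element with its best-scoring subconcept and return that
        subconcept's group, or None if no score reaches the threshold."""
        best, best_score = None, threshold
        for sub in self.parent:
            # Multi-strategy step: average the scores of all Validators.
            score = sum(v(element, sub) for v in validators) / len(validators)
            if score >= best_score:
                best, best_score = sub, score
        return self._find(best) if best is not None else None


def match(source, target, store, validators):
    """Pair source and target elements that fall into the same concept group."""
    pairs = []
    for s, t in product(source, target):
        cs = store.concept_of(s, validators)
        ct = store.concept_of(t, validators)
        if cs is not None and cs == ct:
            pairs.append((s, t))
    return pairs


# Knowledge gained from an earlier (made-up) mapping is reused on new schemas.
store = SubconceptStore()
store.learn([("zip", "postal_code"), ("surname", "last_name")])
print(match(["Zip", "Surname"], ["PostalCode", "LastName"], store, [name_validator]))
```

In this sketch, adding a strategy means appending another function to the `validators` list, and learning from a confirmed mapping means calling `learn` again, which loosely mirrors the two key characteristics listed in the abstract.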
Publication date: 2006